PREFACE


This report consists of three volumes which present the theory
and application of a valuable data reduction tool, the analysis of
covariance. Volume I introduces the analysis of covariance as a general
linear model (GLM) and then expands the model to incorporate the
multivariate case, unequal sample size, and missing observations on the
response variable. Volume I also covers the analysis of covariance for
nonparametric data.

Volumes II and III were prepared by the Department of Statistics,
Oklahoma State University, Stillwater, Oklahoma 74074, under Air Force
Contract F08635-76-C-0154, with the Air Force Armament Laboratory,
Armament Development and Test Center, Eglin Air Force Base, Florida
32542. The contract dealt with the development and programming of the
methodology for evaluating multiple variable data with missing
observations on dependent and independent variables by the analysis of
covariance method. The methodology also covers the case of unequal sample
size. This work was begun in January 1976 and completed in December 1976.
This is Volume II.

This technical report has been reviewed by the Information Officer
(OI) and is releasable to the National Technical Information Service
(NTIS). At NTIS it will be available to the general public, including
foreign nations.

This  technical  report  has  been  reviewed  and  is  approved  for 
publication. 
SECTION I


INTRODUCTION  AND  PROBLEM  STATEMENT 

The main purpose of this study is to extend work done on estimation
and hypothesis testing problems for multivariate linear models describing
situations that cannot be analyzed under the Standard Multivariate (SM)
general linear model. Kleinbaum (9) has developed the theory to deal with
the Growth Curve Multivariate (GCM) model and the More General Linear
Multivariate (MGLM) model which is applicable to the problem of missing
observations among the dependent variables in the SM model with known
design matrix. The author proposes to extend the results of Kleinbaum to
handle an analysis of covariance model with missing observations among
the independent variables or covariates as well as among the dependent
variables.

The Multivariate Analysis of Covariance (MAC) model is based on the
multivariate linear model

$$E(Y) = X\alpha + Z\beta \quad\text{and}\quad \mathrm{Var}(Y) = I_n \otimes \Sigma \tag{1}$$

where

$Y$ is an $n \times p$ matrix composed of $p$-variate responses on $n$
individuals,

$X$ is an $n \times m_x$ known design matrix of rank
$R(X) = r_x\ (\le m_x \le n)$ corresponding to the classificatory
variables of the model,

$\alpha$ is an $m_x \times p$ matrix of unknown parameters,

$Z$ is an $n \times m_z$ matrix composed of concomitant variables, in
the sense that the constant elements of $Z$ are not necessarily
planned in advance by the experimenter, with $R(Z) = r_z\ (\le m_z \le n)$,

$\beta$ is an $m_z \times p$ matrix of unknown concomitant parameters,

$\Sigma = ((\sigma_{rs}))$ is a $p \times p$ positive definite matrix of
usually unknown parameters which represents the variance-covariance
matrix of any row of $Y$,

and $A \otimes B$ is the Kronecker product of the matrices $A$ and $B$.

It is clear from Equation (1) that, in the MAC model, the measurements on
different individuals are assumed to be uncorrelated whereas the
measurements of the p response variates on the same individual may be
correlated.

The MAC model may be more concisely represented by using the following
definitions:

$A = [X : Z]$ is the $n \times m$ design matrix constructed by horizontally
augmenting the design matrix $X$ by the matrix $Z$, where $m = m_x + m_z$,

$\gamma = \begin{pmatrix} \alpha \\ \beta \end{pmatrix}$ is the $m \times p$
matrix of unknown parameters constructed by vertically augmenting the
parameter matrix $\alpha$ by the parameter matrix $\beta$.

Thus, the MAC model may be written as follows:

$$E(Y) = A\gamma \quad\text{and}\quad \mathrm{Var}(Y) = I_n \otimes \Sigma. \tag{2}$$


VARIATE-WISE  REPRESENTATION  OF  THE  MAC  MODEL 


The MAC model may be alternatively represented in a variate-wise
representation by making the following definitions:

$y_s$ is the $n \times 1$ vector which denotes the $s$th $(s = 1, \ldots, p)$
column of $Y$,

and $\gamma_s$ is the $m \times 1$ vector which denotes the $s$th
$(s = 1, \ldots, p)$ column of $\gamma$.

Thus, $Y = [y_1\ y_2\ \cdots\ y_p]$ and $\gamma = [\gamma_1\ \gamma_2\ \cdots\ \gamma_p]$,
so that the MAC model may be described as

$$E(y_s) = A\gamma_s,\ s = 1, 2, \ldots, p, \quad\text{and}\quad \mathrm{Cov}(y_r, y_s) = \sigma_{rs} I_n \ \text{for all}\ r, s = 1, 2, \ldots, p. \tag{3}$$

Thus, the variate-wise representation consists of p univariate models
corresponding to the p variates. These p separate univariate models
are related by the covariances $\sigma_{rs}$ between the different variate pairs.

VECTOR  REPRESENTATION  OF  THE  MAC  MODEL 

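The vector representation of the MAC model is obtained by making the
following definitions. Let

$$\underline{y} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_p \end{pmatrix} \quad\text{and}\quad \underline{\gamma} = \begin{pmatrix} \gamma_1 \\ \gamma_2 \\ \vdots \\ \gamma_p \end{pmatrix}.$$

Thus,

$$E(\underline{y}) = D_A \underline{\gamma} \quad\text{and}\quad \mathrm{Var}(\underline{y}) = \Omega, \tag{4}$$

where $D_A = I_p \otimes A$ and $\Omega = \Sigma \otimes I_n$.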
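The equivalence of the matrix form (2) and the vector form (4) can be checked numerically. The following is a minimal sketch using NumPy; the sizes, the random design matrix, and the particular Sigma are hypothetical values chosen only for illustration and do not come from the report.

```python
import numpy as np

# Hypothetical sizes: n units, p variates, m columns in A = [X : Z].
n, p, m = 5, 2, 3
rng = np.random.default_rng(0)
A = rng.normal(size=(n, m))            # stands in for the design matrix A
gamma = rng.normal(size=(m, p))        # parameter matrix, one column per variate
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])         # p x p covariance of one row of Y

# Vector representation: stack the columns of Y (and of gamma).
D_A = np.kron(np.eye(p), A)            # (np x mp) block-diagonal design
Omega = np.kron(Sigma, np.eye(n))      # (np x np) covariance of the stacked y
gamma_vec = gamma.reshape(-1, order="F")   # column-stacked gamma

mean_matrix_form = (A @ gamma).reshape(-1, order="F")
mean_vector_form = D_A @ gamma_vec
```

In this layout, entry (i, n + i) of Omega is the covariance of the two variates on unit i, while entries linking different units are zero, which is the structure described after Equation (1).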

ESTIMATION  AND  HYPOTHESES  TESTING  IN  THE  MAC  MODEL 

Rao (12), using generalized inverses, has shown for the SM model that
the best linear unbiased estimate (BLUE) of a linear function of the
elements of the parameter matrix, when estimable, is given by the sum
of the BLUE's obtained separately from the univariate models resulting
from the variate-wise representation. For estimating an estimable linear
set of elements of the parameter matrix, Roy (13) suggests using the sum
of the BLUE's for the linear sets obtained separately from the univariate
models. The results of Rao and Roy can easily be extended to the MAC
model.

The general linear hypothesis for the MAC model can be expressed in
the same form as is usual for the SM model, for which a number of test
procedures have been proposed. For example, Wilks' Likelihood Ratio,
Hotelling's Trace ($T_0^2$), and Roy's Largest Root are the tests most commonly
used in practice. Explanations of these tests can be found in standard
texts on multivariate analysis such as Anderson (3) and Morrison (10).

EXPERIMENTAL  SITUATIONS  IN  WHICH  THE  MAC  MODEL  DOES  NOT  APPLY 

The  MAC  model,  as  defined  in  Equations  (1),  (2),  (3),  and  (4),  involves 
three  assumptions  which  are  not  always  met  in  practice  due  to  failure 
or  inability  to  obtain  complete  observations  on  all  experimental  units. 

These  assumptions  are: 

1.  A response  is  observed  on  each  variate  on  all  experimental  units. 

2.  The  design  matrix,  X,  is  the  same  for  each  response  variate. 

3.  Each  concomitant  response  is  observed  on  each  experimental
unit.

In general, the above assumptions are met in the initial design
of an experiment unless it is physically impossible or uneconomical to
observe a response on each variate. But even when the experiment is initially
designed to conform to the above assumptions, missing observations can
occur among the independent as well as the dependent variables due to the
occurrence of some unfortunate event such as the dropping of a test tube,
the failure of an electronic instrument, or the death of a subject before
responses are observed on each variate. These events could be considered
random in the sense that their occurrence is equally likely for each
experimental unit.

Any failure of the experimental data to conform to the above
assumptions renders the MAC model inappropriate for analyzing the
experiment based on all observed data, because any experimental
unit on which one or more dependent and/or independent responses are
missing must be deleted entirely. Thus,
the development of a procedure utilizing all the sample information
would be a valuable contribution to the analysis of such experiments.
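The cost of total deletion can be made concrete with a small count. The sketch below is illustrative only; the records and the helper name `complete_cases` are hypothetical and are not part of the report.

```python
# Hypothetical records: (response 1, response 2, covariate); None marks a
# missing value.
records = [
    (4.1, 7.0, 1.2),
    (3.9, None, 0.8),   # one response missing
    (5.2, 6.4, None),   # covariate missing
    (4.7, 6.9, 1.1),
]

def complete_cases(rows):
    """Keep only the rows in which every value was observed."""
    return [row for row in rows if all(v is not None for v in row)]

kept = complete_cases(records)
observed_total = sum(v is not None for row in records for v in row)
observed_kept = sum(len(row) for row in kept)
discarded = observed_total - observed_kept   # observed values thrown away
```

Here two of the four units are dropped, and four values that were actually observed are discarded along with them.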


SECTION  II 


LITERATURE  REVIEW 

Allan and Wishart (1) were probably the first to consider the problem
of missing data in statistical analysis, whereas Yates (16) was the first
to present a general solution using a least-squares method of substituting
for missing values in a designed experiment. Wilks (15) discussed both
a maximum likelihood approach and a method-of-moments approach to the
problem of missing values in regression analysis.

Zyskind, Kempthorne, et al. (17) present a very thorough treatment
of the analysis of covariance technique, first introduced by Bartlett (4),
as applied to a univariate linear model with missing observations occurring
on the dependent variable. They approach the problem by partitioning the
model

$$E(y) = X\alpha \quad\text{and}\quad \mathrm{Var}(y) = \sigma^2 I_n \tag{5}$$

so that it may be written

$$E\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}\alpha \tag{6}$$

where $y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}$ is an $n \times 1$ vector of observations,

$X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}$ is an $n \times p$ known design matrix of full rank $p < n$,

and $\alpha$ is a $p \times 1$ vector of unknown parameters.

In general, the computational formula for the fitting of a full model
of the form [Equation (2)] is used where the data corresponding to the
vector $y_1$ of m components are missing or are simply not available. Thus,
the model to be fitted is $E(y_2) = X_2\alpha$. A solution to the normal
equations $X_2'X_2\alpha = X_2'y_2$ is not immediate, whereas a solution to the normal
equations corresponding to the full set of data is standard. They capitalize
on the available information by considering the following analysis of
covariance model form:

$$E(y) = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}\alpha + \begin{pmatrix} I_m \\ 0_{n-m,m} \end{pmatrix}\beta \tag{7}$$


where $I_m$ is an $m \times m$ identity matrix. Since the sum of squares of deviations
of the observations from their expected values for the model [Equation (7)] and
the model $E(y_2) = X_2\alpha$ are minimized for identical sets of values for the
vector $\alpha$, the computations required for fitting the model $E(y_2) = X_2\alpha$
can be performed on the corresponding analysis of covariance model. Then
using the facts: (i) that for the model

$$E(y) = X\alpha + Z\beta \tag{8}$$

the full set of normal equations

$$X'X\alpha + X'Z\beta = X'y \tag{9}$$

$$Z'X\alpha + Z'Z\beta = Z'y \tag{10}$$

can be equivalently expressed as

$$X'X\alpha + X'Z\beta = X'y$$

$$[(I - X(X'X)^{-}X')Z]'\,[(I - X(X'X)^{-}X')Z]\,\beta = [(I - X(X'X)^{-}X')Z]'\,y \tag{11}$$

and (ii) that if $\lambda'\alpha$ is an estimable parametric function for the model
$E(y_2) = X_2\alpha$ and if for the model
$E(y) = E\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}\alpha$
the BLUE of $\lambda'\alpha$ is given by

$$\lambda_1' y_1 + \lambda_2' y_2, \tag{12}$$

the BLUE of $\lambda'\alpha$ for the model $E(y_2) = X_2\alpha$ is given by

$$\lambda_1' b + \lambda_2' y_2, \tag{13}$$

where $b$ is obtained by solving the error normal Equations (11) for the
covariance model [Equation (7)].

Thus, $b$ in Equation (13) plays the role of $y_1$ in the point
estimation of $\lambda'\alpha$ for the model
$E(y) = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}\alpha = X\alpha$. It would
appear that one could easily extend the results of Zyskind, Kempthorne,
et al. to handle the problem of missing responses among the dependent
variables of a multivariate linear model. However, this is not the case,
because their solution depends upon the fact that the residual
sums of squares for the model [Equation (7)] and the model $E(y_2) = X_2\alpha$ are
identical for identical sets of values for the vector $\alpha$, which is not
guaranteed in the multivariate case due to the covariance structure among
responses from the same experimental unit.
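The central fact used above, that the covariance model and the reduced model are minimized by the same values of alpha, can be checked numerically. The sketch below uses NumPy and makes two assumptions that are not taken from the report: zeros are used as placeholders for the missing responses, and the indicator covariates enter with a positive sign. Under that convention the fitted covariate coefficients are the negatives of the predicted missing responses; the sign convention in the original derivation may differ.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, m = 8, 2, 3                     # hypothetical: m of the n responses are missing
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=0.1, size=n)

X1, X2 = X[:m], X[m:]                 # rows with missing / observed responses
y2 = y[m:]

# Direct fit of E(y2) = X2 alpha on the observed rows only.
alpha_direct, *_ = np.linalg.lstsq(X2, y2, rcond=None)

# Covariance model: zeros stand in for the missing y1 (an assumption), and
# one indicator covariate per missing row absorbs that row's residual.
y_filled = np.concatenate([np.zeros(m), y2])
Z = np.vstack([np.eye(m), np.zeros((n - m, m))])
coef, *_ = np.linalg.lstsq(np.hstack([X, Z]), y_filled, rcond=None)
alpha_cov, b = coef[:p], coef[p:]
```

Both fits return the same estimate of alpha, so the standard full-data computations can be reused once the indicator covariates are appended.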

Haitovsky (7) compares two alternative methods for dealing with the
problem of missing observations among the independent variables and/or the
dependent variables in a univariate regression model. One method (Method
1) is simply to discard all incomplete observations and then apply the
ordinary least-squares technique to the complete observations. The other
method (Method 2) consists of computing the covariances between all pairs
of variables, each time using only the observations having values of
both variables, and using these covariances in constructing the system
of normal equations

$$\mathrm{Cov}(x_i, x_j)\,\beta = \mathrm{Cov}(x_i, y), \quad (i, j = 1, \ldots, m), \tag{14}$$

where $\mathrm{Cov}(x_i, x_j)$ is the $m \times m$ covariance matrix in which the
$(i,j)$th element $(i, j = 1, \ldots, m)$ is computed from the measurements
common to both $x_i$ and $x_j$ $(i \ne j)$ as well as from all the existing
measurements on $x_i$ for $i = j$, and similarly for $\mathrm{Cov}(x_i, y)$
$(i = 1, \ldots, m)$. The comparison was made using Monte Carlo
techniques since Method 2 does not have optimal statistical properties
and since the derivation of its distribution theory is intractable. Comparing
the two methods with regard to unbiasedness and efficiency indicated
that Method 2 was superior only in the rare case in which 9 to 10 percent
of the observations were complete and hence available for use in Method 1.
By decomposing the Mean Square Error (MSE) into one term accounting for
bias and the other accounting for the variance when bias is ignored,
Haitovsky was able to show that the variance term was far more important
in the large difference observed between the two methods. He concluded
that, although the bias affects the relevance of the inference, the
major problem with Method 2 is caused by the inconsistency introduced into
the system of normal Equations (14).
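The two methods can be sketched as follows. This is an illustrative NumPy sketch, not Haitovsky's own computation: the function names are hypothetical, an intercept is handled by centering, and covariances use divisor N on the available pairs (an assumption; the choice of divisor is not stated in the text above).

```python
import numpy as np

def method1_complete_cases(X, y):
    """Method 1: drop incomplete rows, then ordinary least-squares slopes."""
    ok = ~np.isnan(X).any(axis=1) & ~np.isnan(y)
    Xc = X[ok] - X[ok].mean(axis=0)
    yc = y[ok] - y[ok].mean()
    return np.linalg.lstsq(Xc, yc, rcond=None)[0]

def pairwise_cov(u, v):
    """Covariance from the observations on which both u and v exist."""
    ok = ~np.isnan(u) & ~np.isnan(v)
    return np.mean((u[ok] - u[ok].mean()) * (v[ok] - v[ok].mean()))

def method2_pairwise(X, y):
    """Method 2: normal equations built from pairwise covariances."""
    m = X.shape[1]
    Sxx = np.array([[pairwise_cov(X[:, i], X[:, j]) for j in range(m)]
                    for i in range(m)])
    Sxy = np.array([pairwise_cov(X[:, i], y) for i in range(m)])
    return np.linalg.solve(Sxx, Sxy)

# Hypothetical data: with nothing missing the two methods coincide.
rng = np.random.default_rng(2)
X = rng.normal(size=(30, 2))
y = X @ np.array([1.5, -0.5]) + rng.normal(scale=0.2, size=30)
b_full_1 = method1_complete_cases(X, y)
b_full_2 = method2_pairwise(X, y)

X_miss = X.copy()
X_miss[:5, 0] = np.nan                 # knock out five values of x_1
b_miss_2 = method2_pairwise(X_miss, y)
```

Because each element of the pairwise covariance matrix may rest on a different subset of units, the assembled matrix need not correspond to any single data set, which is one way to read the inconsistency noted above.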

Buck (6) treats the problem of missing values among the dependent
variables in a multivariate linear model by estimating the missing values
by regression techniques and then calculating a revised variance-covariance
matrix. He represents the sample of n experimental units by expressing
the responses, $y_{ij}$ $(i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, p)$, in the form of an
$n \times p$ matrix, $Y$, in which some of the elements are missing. Assuming that
k of the n p-variate responses are complete, he lets these form the first
k rows of $Y$ and then calculates the expected value of $y_{rj}$ $(r = 1, 2, \ldots, k)$
by forming, for each value of j, the multiple regression of the jth variable
on the other p-1 variables from the set of observations consisting of the
first k rows of $Y$. Thus, he obtains p equations which can be expressed
as

$$E(y_{rj}) = f_j(y_{r1}, y_{r2}, \ldots, y_{r,j-1}, y_{r,j+1}, \ldots, y_{rp}). \tag{15}$$

The missing values are then estimated as follows. If the ith unit has
the jth observation missing, its value, $y_{ij}$, is estimated by one of the
Equations (15), substituting $y_{i\cdot}$ for $y_{r\cdot}$, that is,

$$E(y_{ij}) = f_j(y_{i1}, y_{i2}, \ldots, y_{i,j-1}, y_{i,j+1}, \ldots, y_{ip}).$$

This formulation assumes only one missing value in each incomplete response
but can be extended to the case in which units have more than one missing
value. Buck shows that if the value $y_j$ is missing for a proportion $\lambda$
of all experimental units, and the predicted values are substituted and
a new variance-covariance matrix calculated, then the expectations in this
matrix are the same as they would be if there were no missing values,
except for the variance $v'_{jj}$ of $y_j$ which, in terms of expectations, is

$$v'_{jj} = v_{jj} - \frac{\lambda}{c_{jj}},$$

where $v_{jj}$ is the jth diagonal element of the variance-covariance matrix,
say $V$, that would result if there were no missing elements and $c_{jj}$ is the
jth diagonal element in $V^{-1}$.
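The imputation step can be sketched as below. This is a minimal NumPy illustration restricted to the one-missing-value-per-unit case; the function name `buck_impute` and the test data are hypothetical, and the sketch does not include Buck's subsequent correction of the variance-covariance matrix.

```python
import numpy as np

def buck_impute(Y):
    """Fill single missing entries by regression fitted on the complete rows."""
    Y = np.asarray(Y, dtype=float)
    filled = Y.copy()
    complete = ~np.isnan(Y).any(axis=1)
    Yc = Y[complete]
    for i in np.where(~complete)[0]:
        missing = np.where(np.isnan(Y[i]))[0]
        if len(missing) != 1:
            continue                   # only the one-missing-value case is sketched
        j = missing[0]
        others = [c for c in range(Y.shape[1]) if c != j]
        # Regress variate j on the other variates (with intercept).
        design = np.column_stack([np.ones(len(Yc)), Yc[:, others]])
        coef, *_ = np.linalg.lstsq(design, Yc[:, j], rcond=None)
        filled[i, j] = coef[0] + Y[i, others] @ coef[1:]
    return filled

# Hypothetical data in which the third variate is an exact linear function
# of the first two, so the regressions recover the deleted values exactly.
rng = np.random.default_rng(3)
base = rng.normal(size=(10, 2))
Y_true = np.column_stack([base, 1 + 2 * base[:, 0] - base[:, 1]])
Y_obs = Y_true.copy()
Y_obs[0, 2] = np.nan
Y_obs[1, 0] = np.nan
Y_hat = buck_impute(Y_obs)
```

With real data the imputed values lie exactly on the fitted regression surface, which is why the variance computed from the completed matrix is too small and needs the adjustment described above.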

Beale and Little (5) propose a solution to the problem of missing
observations in the dependent variables of a multivariate normal linear
model based on the Missing Information Principle of Orchard and Woodbury
(11), which involves approximating the Maximum Likelihood solution through
an iterative technique. The argument of Beale and Little follows that
of Orchard and Woodbury but emphasizes that the effect of the principle
is to replace a maximization problem by a fixed point problem. They construct
a conditional likelihood function composed of the likelihood equation for
known values plus a conditional likelihood of unknown values given the
known values and then show that a stationary solution to the conditional
likelihood equation is equivalent to the Maximum Likelihood solution
based on the original likelihood equation. Thus, assuming the $n \times p$
observation matrix, $Y$, is distributed as a Multivariate Normal, they group the
observations into two vectors $y$ and $z$ with a joint distribution depending
on the vector $\theta$ of parameters, where $y$ has been observed but $z$ has not
been observed. To approximate the Maximum Likelihood Estimate (MLE),
$\hat\theta$, of $\theta$, based on the log likelihood $L(y; \theta)$, they suggest maximizing the
expected value of $L(z, y; \theta)$ where $z$ is treated as a random variable with
some known distribution. Thus, letting $f(z \mid y; \theta)$ denote the probability
density function for the conditional distribution of $z$ given $y$ and $\theta$,
and letting $L(z \mid y; \theta)$ denote $\ln[f(z \mid y; \theta)]$, then

$$L(z, y; \theta) = L(y; \theta) + L(z \mid y; \theta). \tag{16}$$

A distribution is defined for $z$ by taking any assumed value $\theta_A$ for $\theta$ along
with the observed value of $y$, and one can then take expectations of both
sides of Equation (16), integrating with respect to $z$. This is expressed by

$$E\{L(z, y; \theta) \mid y; \theta_A\} = L(y; \theta) + E\{L(z \mid y; \theta) \mid y; \theta_A\}. \tag{17}$$

They then find the value $\theta_M$ of $\theta$ that maximizes the left hand side of
Equation (17) and write

$$\theta_M = \Phi(\theta_A) \tag{18}$$

since $\theta_M$ may depend on $\theta_A$. Thus, Equation (18) represents a
transformation from the vector $\theta_A$ to the vector $\theta_M$, from which the Missing
Information Principle originates. The Missing Information Principle
involves estimating $\theta$ by a fixed point of the transformation, namely
a value of $\theta$ such that $\theta = \Phi(\theta)$.
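A very small case shows the fixed-point idea at work. The sketch below is an illustration under strong assumptions that are not in the report: the data are bivariate, the covariance matrix is treated as known, only the mean is estimated, and the second variate is missing on some units. In that case the maximizing step reduces to averaging the data after each missing value is replaced by its conditional expectation.

```python
# Hypothetical bivariate data: (y1, y2); y2 is None where unobserved.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, None), (5.0, None)]
s11, s12 = 2.0, 3.0          # assumed known variance of y1 and covariance
slope = s12 / s11            # regression slope of y2 on y1

def phi(theta):
    """One pass of the transformation theta_A -> theta_M for the mean."""
    mu1, mu2 = theta
    n = len(data)
    new_mu1 = sum(y1 for y1, _ in data) / n
    # Replace each missing y2 by its conditional expectation under theta.
    filled = [y2 if y2 is not None else mu2 + slope * (y1 - mu1)
              for y1, y2 in data]
    return (new_mu1, sum(filled) / n)

theta = (0.0, 0.0)
for _ in range(200):
    theta = phi(theta)

# Closed form for this special case, used only as a check.
obs = [(y1, y2) for y1, y2 in data if y2 is not None]
y1_all = sum(y1 for y1, _ in data) / len(data)
y1_obs = sum(y1 for y1, _ in obs) / len(obs)
y2_obs = sum(y2 for _, y2 in obs) / len(obs)
closed_form = (y1_all, y2_obs + slope * (y1_all - y1_obs))
```

The iteration settles on a value that `phi` maps to itself, and that value agrees with the regression-adjusted mean given by the closed form.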

As mentioned in the introduction, Kleinbaum (9) proposes a solution
to the problem of estimation and hypothesis testing for the MGLM
model which is applicable to the case involving missing observations among
the dependent variables in the SM model with known design matrix. He
writes the SM model in the form

$$E(Y) = X\alpha \quad\text{and}\quad \mathrm{Var}(Y) = I_n \otimes \Sigma \tag{19}$$

where $X$ is an $n \times m$ known design matrix of rank $R(X) = r\ (\le m \le n)$,

$\alpha$ is an $m \times p$ matrix of unknown parameters,

and $\Sigma = ((\sigma_{rs}))$ is a $p \times p$ positive definite matrix of usually unknown
parameters representing the variance-covariance matrix of any
row of $Y$.

Letting $y_s$ be the $n \times 1$ vector denoting the $s$th column of $Y$ and
$\alpha_s$ the $m \times 1$ vector denoting the $s$th column of $\alpha$, he writes the
variate-wise representation of the SM model as

$$E(y_s) = X\alpha_s, \quad \mathrm{Var}(y_s) = \sigma_{ss} I_n, \quad s = 1, \ldots, p; \qquad \mathrm{Cov}(y_r, y_s) = \sigma_{rs} I_n \ \text{when}\ r \ne s. \tag{20}$$

Then stacking the observation vectors on top of one another, the vector
representation of the SM model becomes

$$E(\underline{y}) = D_X \underline{\alpha} \quad\text{and}\quad \mathrm{Var}(\underline{y}) = \Omega \tag{21}$$

where $D_X = I_p \otimes X$ and $\Omega = \Sigma \otimes I_n$.

From these representations Kleinbaum develops a general form of the model
which allows the omission of responses from variates not observed on a
given experimental unit. For the case involving missing observations
among the dependent variables of an SM model, he constructs the
generalized model as follows. Assuming there are n experimental units
and p response variates $V_1, \ldots, V_p$ in total, he lets $z_s$, $s = 1, \ldots, p$, be
the vector of length $N_s$, say, corresponding to all observations on $V_s$
in the entire experiment and lets $X_s$ be the $N_s \times m$ design matrix corresponding
to $z_s$, i.e., $X_s$ is determined from $X$ by omitting the rows which
correspond to missing values of $y_s$. He then lets the $N_r \times N_s$ $(r < s)$ matrix
$Q_{rs}$ denote the incidence matrix of 0's and 1's defined by
$Q_{rs} = ((q_{ij(rs)}))$ where

$$q_{ij(rs)} = \begin{cases} 1 & \text{if the } i\text{th component of } z_r \text{ and the } j\text{th component of } z_s \text{ are observed on the same experimental unit,} \\ 0 & \text{otherwise.} \end{cases}$$
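The incidence matrix is easy to build from the pattern of observed units. The following sketch is illustrative; the function name and the observation pattern are hypothetical.

```python
def incidence_matrix(observed_r, observed_s):
    """Q_rs: rows index the observed units of V_r, columns those of V_s."""
    units_r = [u for u, seen in enumerate(observed_r) if seen]
    units_s = [u for u, seen in enumerate(observed_s) if seen]
    return [[1 if ur == us else 0 for us in units_s] for ur in units_r]

# Hypothetical pattern for n = 4 units: V_1 is missing on unit 2 and
# V_2 is missing on unit 0, so N_1 = N_2 = 3.
obs_1 = [True, True, False, True]
obs_2 = [False, True, True, True]
Q12 = incidence_matrix(obs_1, obs_2)
```

A row of zeros (here the first row) marks a unit observed on the first variate but not on the second; such a unit contributes nothing to the covariance between the two stacked response vectors.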


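Thus, the variate-wise representation of the MGLM model is given by

$$\begin{aligned} E(z_s) &= X_s\alpha_s, \qquad \mathrm{Var}(z_s) = \sigma_{ss} I_{N_s}, \\ \mathrm{Cov}(z_r, z_s) &= \sigma_{rs} Q_{rs}, \quad r < s, \\ \mathrm{Cov}(z_r, z_s) &= \sigma_{rs} Q'_{sr}, \quad r > s, \qquad r, s = 1, \ldots, p, \end{aligned} \tag{22}$$

and with the above definitions the vector representation of the MGLM is
given by:

$$E(\underline{z}) = D\underline{\alpha} \quad\text{and}\quad \mathrm{Var}(\underline{z}) = \Omega \tag{23}$$

where $\underline{z}$ stacks $z_1, \ldots, z_p$, $D$ is the $N \times M$ block-diagonal matrix with
diagonal blocks $X_1, \ldots, X_p$, $\underline{\alpha}$ stacks $\alpha_1, \ldots, \alpha_p$, $\Omega$ is the
$N \times N$ matrix with diagonal blocks $\sigma_{ss} I_{N_s}$ and off-diagonal blocks
$\sigma_{rs} Q_{rs}$ $(r < s)$ and their transposes,

$$N = \sum_{s=1}^{p} N_s \quad\text{and}\quad M = mp.$$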


Kleinbaum then shows that the unique BLUE of any estimable linear
function or linear set of the treatment parameters is given by a linear
function or linear set, respectively, which involves the unknown
parameters of the variance matrix $\Omega$. In fact, restricting linear
estimates to be known functions not involving $\Omega$ requires additional
restrictive conditions on the model. Therefore, he considers Best
Asymptotically Normal (BAN) estimation, which is a nonlinear method
of estimation using estimates of $\Sigma$ and yielding variances that are,
in large samples, the minimum that could be achieved by linear estimators
if $\Sigma$ were known.

For testing linear hypotheses in the MGLM model, assuming the
data are normally distributed, Kleinbaum suggests using test statistics
which are quadratic forms called Wald Statistics and are constructed
from BAN estimators of linear functions of the treatment parameters.
Since the asymptotic distribution of a Wald Statistic is that of a central
chi-square variable, the test criteria yield chi-square tests when the
sample size is large.

Attempts have been made by several authors to obtain Maximum Likelihood
Estimates (MLE) of the parameters in a multivariate linear model with
missing observations among the dependent variables. However, most of
these methods are applicable to only very specific models. For instance,
Anderson (2) describes an iterative technique for obtaining the MLE's
of $\alpha'_{(p \times 1)}$ and $\Sigma$ when $X_s$ is an $(N_s \times 1)$ vector of ones. Hocking and
Smith (8) have developed a procedure for obtaining BAN estimators of
$\alpha$ and $\Sigma$ for the multivariate linear model with missing observations among
the dependent variables and they have shown for a special case that their
approach yields the maximum likelihood solutions obtained by Anderson.
Their estimation procedure involves obtaining initial estimates of the
parameters from the group of observations with no missing values and then
modifying these initial estimators by adjoining the information in all
the remaining groups in a sequential manner by the addition of linear
combinations of zero expectations. However, for purposes of a general
computer program, extremely cumbersome notation would be required to
express the formulae for calculating the estimators at each stage. In
fact, Hocking and Smith have only considered a few special cases which
involve simply structured models.
SECTION III


PROPOSED  SOLUTION 

It appears that, if it were possible to generalize the results cited
in the literature which deal with missing observations, at best one would
have procedures for handling missing values among the dependent and/or
independent variables in a univariate analysis of covariance model or
missing values among the dependent variables in a multivariate analysis
of covariance model. The general form of the SM model for missing
observations among the dependent variables, as discussed by Srivastava
(14) and Kleinbaum (9), does, however, appear to be valuable as an initial
representation of a MAC model in which missing observations occur among
the dependent and/or independent variables. In fact, the results
of Kleinbaum for estimation and hypothesis testing in the MGLM can be
generalized to the More General Multivariate Analysis of Covariance
(MGMAC) model by employing a procedure for dealing with the missing
independent variables similar to that employed by Zyskind, Kempthorne, et
al. to deal with missing dependent variables in a univariate linear model.

THE  MAC  MODEL  WITH  MISSING  DEPENDENT  AND/OR  MISSING  INDEPENDENT 
VARIABLES  (MGMAC) 

For purposes of clarity and simplification, the general form of
the MGMAC model will be presented by first rewriting the various forms
of the MAC model, then generalizing to the General Multivariate Analysis
of Covariance (GMAC) model (i.e., the MAC with missing dependent variables),
and finally by extending the GMAC to the MGMAC model (i.e., with missing
dependent and/or independent variables). To make the presentation as
brief as possible, definitions of variables and parameters previously
defined will be omitted unless specifically needed for clarification.

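The Multivariate Analysis of Covariance Model (MAC) can be represented by

$$E(Y) = X\alpha + Z\beta, \qquad \mathrm{Var}(Y) = I_n \otimes \Sigma \tag{24}$$

or alternatively by

$$E(Y) = A\gamma \quad\text{where}\quad A = [X : Z]. \tag{25}$$

Thus, the variate-wise representation of the MAC is given by

$$E(y_s) = A\gamma_s,\ s = 1, \ldots, p, \quad\text{and}\quad \mathrm{Cov}(y_r, y_s) = \sigma_{rs} I_n \ \text{for all}\ r, s = 1, \ldots, p, \tag{26}$$

and the vector representation is given by

$$E(\underline{y}) = D_A\underline{\gamma} \quad\text{and}\quad \mathrm{Var}(\underline{y}) = \Omega \tag{27}$$

where $D_A = I_p \otimes A$ and $\Omega = \Sigma \otimes I_n$.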

To obtain the general form of the GMAC, assume there are n experimental
units and p response variates $V_1, \ldots, V_p$ in total. Let $z_s$, $s = 1, \ldots, p$,
be the vector of length $N_s$, say, corresponding to all observations on $V_s$
in the entire experiment. Let $A_s$, $s = 1, \ldots, p$, be the design matrix
corresponding to $z_s$, i.e., $A_s$ is determined from $A$ by deleting those
rows which correspond to missing values of $y_s$. Let $Q_{rs}$ $(N_r \times N_s)$, $r < s$,
denote the incidence matrix of 0's and 1's defined by $Q_{rs} = ((q_{ij(rs)}))$
where

$$q_{ij(rs)} = \begin{cases} 1 & \text{if the } i\text{th component of } z_r \text{ and the } j\text{th component of } z_s \text{ are observed on the same experimental unit,} \\ 0 & \text{otherwise.} \end{cases}$$
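Then the variate-wise representation of the GMAC is given by

$$\begin{aligned} E(z_s) &= A_s\gamma_s, \qquad \mathrm{Var}(z_s) = \sigma_{ss} I_{N_s}, \\ \mathrm{Cov}(z_r, z_s) &= \sigma_{rs} Q_{rs}, \quad r < s, \\ \mathrm{Cov}(z_r, z_s) &= \sigma_{rs} Q'_{sr}, \quad r > s, \qquad r, s = 1, \ldots, p. \end{aligned} \tag{28}$$

The vector representation of the GMAC is given by

$$E(\underline{z}) = D\underline{\gamma} \quad\text{and}\quad \mathrm{Var}(\underline{z}) = \Omega \tag{29}$$

where

$$\underline{z}_{(N \times 1)} = \begin{pmatrix} z_1 \\ z_2 \\ \vdots \\ z_p \end{pmatrix}, \quad D_{(N \times M)} = \begin{pmatrix} A_1 & & & \\ & A_2 & & \\ & & \ddots & \\ & & & A_p \end{pmatrix}, \quad \underline{\gamma}_{(M \times 1)} = \begin{pmatrix} \gamma_1 \\ \gamma_2 \\ \vdots \\ \gamma_p \end{pmatrix},$$

$$\Omega_{(N \times N)} = \begin{pmatrix} \sigma_{11} I_{N_1} & \sigma_{12} Q_{12} & \cdots & \sigma_{1p} Q_{1p} \\ \sigma_{12} Q'_{12} & \sigma_{22} I_{N_2} & \cdots & \sigma_{2p} Q_{2p} \\ \vdots & \vdots & & \vdots \\ \sigma_{1p} Q'_{1p} & \sigma_{2p} Q'_{2p} & \cdots & \sigma_{pp} I_{N_p} \end{pmatrix},$$

$$N = \sum_{s=1}^{p} N_s \quad\text{and}\quad M = mp.$$

To obtain the general form of the MGMAC, assume that the design
matrix $A = [X : Z]$ of the MAC model has $t_\ell$ missing observations
in the $\ell$th column $(\ell = m_x + 1, \ldots, m)$. Then in the design matrices
$A_s$ $(N_s \times m)$ of the variate-wise representation of the GMAC model the
$\ell$th column will have $t_{\ell s} = t_\ell - k_{\ell s}$ missing values, where
$k_{\ell s}$ is the number of experimental units in which both the independent
variable in column $\ell$ of $A_s$ and the dependent variable on variate $V_s$
are missing. Thus, $A_s$ would have

$$t_s = \sum_{\ell = m_x + 1}^{m} t_{\ell s}$$

missing values.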

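Then replace $A_s$ by $F_s$, where $F_s$ is derived from $A_s$ by augmenting $A_s$ (with
0's in place of missing values) by a matrix $A^*_s$ of dimension $(N_s \times t_s)$ composed of
$t_s$ columns, each with a one in the row position corresponding to one of the
missing values in $A_s$ and zeros elsewhere. [Note: $F_s$ has dimension
$(N_s \times m_s)$ where $m_s = m + t_s$.] Thus, the variate-wise representation
of the MGMAC is given by

$$\begin{aligned} E(z_s) &= F_s\xi_s, \qquad \mathrm{Var}(z_s) = \sigma_{ss} I_{N_s}, \\ \mathrm{Cov}(z_r, z_s) &= \sigma_{rs} Q_{rs}, \quad r < s, \\ \mathrm{Cov}(z_r, z_s) &= \sigma_{rs} Q'_{sr}, \quad r > s, \qquad r, s = 1, \ldots, p, \end{aligned} \tag{30}$$

where

$$\xi_s = \begin{pmatrix} \gamma_s \\ \delta_s \end{pmatrix}$$

and where $\delta_s$ is a $(t_s \times 1)$ vector of unknown parameters due to the
missing values in $A_s$.

The vector representation of the MGMAC model is given by:

$$E(\underline{z}) = F\underline{\xi} \quad\text{and}\quad \mathrm{Var}(\underline{z}) = \Omega \tag{31}$$

where

$$F_{(N \times M)} = \begin{pmatrix} F_1 & & & \\ & F_2 & & \\ & & \ddots & \\ & & & F_p \end{pmatrix}, \quad \underline{\xi} = \begin{pmatrix} \xi_1 \\ \xi_2 \\ \vdots \\ \xi_p \end{pmatrix}, \quad N = \sum_{s=1}^{p} N_s \quad\text{and}\quad M = \sum_{s=1}^{p} m_s.$$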
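The construction of an augmented design matrix can be sketched in a few lines. The example is illustrative only: the function name `augment_design`, the use of None as the missing-value marker, and the small design matrix are assumptions made for the sketch.

```python
def augment_design(A_s):
    """Return F_s = [A_s with 0 for missing values : indicator columns]."""
    missing = [(i, j) for i, row in enumerate(A_s)
               for j, v in enumerate(row) if v is None]
    F_s = []
    for i, row in enumerate(A_s):
        base = [0.0 if v is None else float(v) for v in row]
        # One indicator column per missing value of A_s.
        dummies = [1.0 if i == mi else 0.0 for mi, _ in missing]
        F_s.append(base + dummies)
    return F_s

# Hypothetical A_s with N_s = 4 rows and m = 3 columns (two group
# indicators and one covariate); the covariate is missing on rows 1 and 3,
# so t_s = 2 and F_s has m_s = m + t_s = 5 columns.
A_s = [[1, 0, 2.5],
       [1, 0, None],
       [0, 1, 1.7],
       [0, 1, None]]
F_s = augment_design(A_s)
```

Each added column carries its own parameter in the vector of extra parameters, so a unit with a missing covariate keeps its observed response in the analysis instead of being deleted.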


ESTIMATION  FOR  THE  MGMAC  MODEL 

Theorem 1: If $\theta = H'\underline{\xi} = \sum_{s=1}^{p} C_s'\xi_s$ is estimable, and if $\Sigma$ is known, then
$\theta$ has a unique BLUE given by

$$\hat\theta = H'\hat{\underline{\xi}} = H'(F'\Omega^{-1}F)^{-}F'\Omega^{-1}\underline{z}$$

whose variance-covariance matrix is given by

$$\mathrm{Var}(\hat\theta) = H'(F'\Omega^{-1}F)^{-}H$$

where $C_s$ is a known $(m_s \times 1)$ vector $(s = 1, \ldots, p)$.
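The estimator in Theorem 1 can be computed directly once F and Omega are assembled. The sketch below uses NumPy with the Moore-Penrose inverse standing in for the generalized inverse (one valid choice among many); the layout, the value of Sigma, and the function name `blue` are hypothetical.

```python
import numpy as np

def blue(H, F, Omega, z):
    """Return H'(F' Omega^-1 F)^- F' Omega^-1 z and H'(F' Omega^-1 F)^- H."""
    Oinv = np.linalg.inv(Omega)
    G = np.linalg.pinv(F.T @ Oinv @ F)   # Moore-Penrose inverse as the g-inverse
    return H.T @ G @ F.T @ Oinv @ z, H.T @ G @ H

# Hypothetical layout: p = 2 variates on 3 units, V_2 missing on the last
# unit, so N_1 = 3 and N_2 = 2; each F_s is an intercept column only.
F1, F2 = np.ones((3, 1)), np.ones((2, 1))
F = np.block([[F1, np.zeros((3, 1))],
              [np.zeros((2, 1)), F2]])
Q12 = np.array([[1, 0], [0, 1], [0, 0]])          # incidence matrix
s11, s12, s22 = 1.0, 0.5, 2.0                     # assumed known Sigma
Omega = np.block([[s11 * np.eye(3), s12 * Q12],
                  [s12 * Q12.T, s22 * np.eye(2)]])

xi_true = np.array([2.0, 3.0])
z = F @ xi_true                                   # noiseless responses
H = np.eye(2)
theta_hat, var_theta = blue(H, F, Omega, z)
```

With noiseless responses the estimator returns the generating parameters exactly, which is a quick sanity check on the assembly of F and Omega.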

If we restrict our estimators to be known linear functions of $\underline{z}$,
then we cannot use the $\hat\theta$ above unless it is independent of $\Omega$.

Theorem 2: For the MGMAC model $\hat\theta$ is not independent of $\Omega$ unless the
following conditions are satisfied:

$$C_s'(F_s'F_s)^{-}F_s'Q'_{rs} \in V(F_r'), \quad r < s,$$

$$C_s'(F_s'F_s)^{-}F_s'Q_{sr} \in V(F_r'), \quad r > s,$$

where $Q_{rs}$ $(N_r \times N_s)$, $r < s$ $(r, s = 1, \ldots, p)$, is defined as before.

If the above conditions are not satisfied, one is led to consider
nonlinear methods of estimation which use estimates of $\Sigma$ and which give
variances that are, in large samples, the minimum that could be achieved
by linear estimators if $\Sigma$ were known.

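Theorem 3: A BAN estimator which is unbiased for any estimable set
$\theta = H'\underline{\xi}$ is given by

$$\hat\theta = H'\hat{\underline{\xi}} = H'(F'\hat\Omega^{-1}F)^{-}F'\hat\Omega^{-1}\underline{z} \tag{32}$$

where $\hat\Omega$ is obtained from $\Omega$ by substituting the elements of $\hat\Sigma = ((\hat\sigma_{rs}))$
given in Theorem 4 below for the corresponding elements of $\Sigma = ((\sigma_{rs}))$.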
Theorem 4: For the MGMAC model, a consistent and unbiased estimate of
$\Sigma$ is given by $\hat\Sigma = ((\hat\sigma_{rs}))$ where

$$\hat\sigma_{ss} = \frac{1}{N_s - R(F_s)}\, z_s'\left[I_{N_s} - F_s(F_s'F_s)^{-}F_s'\right] z_s, \quad s = 1, \ldots, p, \tag{33}$$

and

$$\hat\sigma_{rs} = \frac{1}{N_{rs} - R(F_{rs})}\, z_{rs}'\left[I_{N_{rs}} - F_{rs}(F_{rs}'F_{rs})^{-}F_{rs}'\right] z_{sr}, \quad r \ne s \quad (r, s = 1, 2, \ldots, p),$$

where $N_r$ $(\ge 2)$ is the number of experimental units on which $V_r$ is observed,

$N_{rs}$ $(\ge 2)$ is the number of experimental units on which both $V_r$ and
$V_s$ are observed together,

$z_r$ $(N_r \times 1)$ is the vector of all observations on $V_r$,

$z_{rs}$ $(N_{rs} \times 1)$, $r \ne s$, is the vector of observations on $V_r$ which correspond
to units on which both $V_r$ and $V_s$ are observed together,

$F_r$ $(N_r \times m_r)$ is the design matrix corresponding to $z_r$, and

$F_{rs}$, with $N_{rs}$ rows, is the design matrix corresponding to $z_{rs}$.


The  proof  of  the  above  theorem  follows  easily  from  Kleinbaum  (9). 

Theorem 5: For the MGMAC model, the asymptotic variance matrix of any BAN estimator of an estimable linear set $H'\xi$, where $H_{(M \times w)}$ is of full rank $w$, is given by

$$H'(F'\Omega^{-1}F)^{-}H .$$

Note: $H'(F'\Omega^{-1}F)^{-}H$ is the same as the variance matrix of the unique BLUE set $\hat{\theta} = H'\hat{\xi}$ for $H'\xi$ when $\Omega$ is known.


TESTING  LINEAR  HYPOTHESES  FOR  THE  MGMAC  MODEL 


Theorem 6: For the MGMAC model, let $H'\xi$ be estimable where $H_{(M \times w)}$ is known and of full rank $w$. Then, if the null hypothesis is $H_0: H'\xi = 0$,

$$W_n = (H'\hat{\xi})'\left[H'(F'\hat{\Omega}^{-1}F)^{-}H\right]^{-1}(H'\hat{\xi}) \tag{34}$$

is asymptotically distributed as a central chi-square variable with $w$ degrees of freedom, where

$\hat{\Omega} = \Omega\,\big|_{\Sigma = \hat{\Sigma}}$,

$\hat{\Sigma}$ is any positive definite consistent estimator of $\Sigma$,

$\Omega$ and $F$ are defined by the vector representation, and $H'\hat{\xi}$ is any BAN estimator of $H'\xi$. This result is easily extended from Kleinbaum (9).

To test the hypothesis $H_0: H'\xi = 0$, we may thus reject $H_0$ if $W_n \geq \chi^2_{w,\,1-\alpha}$ and accept otherwise.
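
The statistic of Equation (34) and the rejection rule above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `wald_statistic`, the two-group toy data, and the use of an identity matrix for $\hat{\Omega}$ are hypothetical and do not come from this report. The critical value 3.841 is the standard 0.95 quantile of the central chi-square distribution with 1 degree of freedom.

```python
import numpy as np

def wald_statistic(H, F, Omega_hat, z):
    """W_n = (H' xi_hat)' [H'(F' Omega_hat^{-1} F)^- H]^{-1} (H' xi_hat)."""
    Omega_inv = np.linalg.inv(Omega_hat)
    G = np.linalg.pinv(F.T @ Omega_inv @ F)   # one generalized inverse
    h = H.T @ G @ F.T @ Omega_inv @ z         # H' xi_hat
    V = H.T @ G @ H                           # estimated variance of H' xi_hat
    return float(h @ np.linalg.solve(V, h))

# Hypothetical toy data: two groups of three observations each
F = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0],
              [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]])
z = np.array([1.0, 2.0, 3.0, 2.0, 3.0, 4.0])
Omega_hat = np.eye(6)                # assumed estimate of Var(z)
H = np.array([[1.0], [-1.0]])        # tests equality of the two group means (w = 1)

W = wald_statistic(H, F, Omega_hat, z)
critical_value = 3.841               # chi-square 0.95 quantile, 1 degree of freedom
reject = W >= critical_value
print(W, reject)
```

In this toy case the group means are 2 and 3 and each has variance 1/3, so the statistic is $1 / (2/3) = 1.5$ and the hypothesis is not rejected at the 0.05 level.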

NOTE:  All  the  above  theorems  follow  easily  from  similar  theorems 
by  Kleinbaum  (9) . 


AN EXAMPLE

The following example is given to illustrate the procedures outlined in Section III for testing hypotheses and for obtaining parameter estimates. The data consist of a portion of the data from an exercise in Morrison (10). The small sample size was chosen only to make the problem manageable for hand computations. The dependent variables represent two characteristics of urine specimens of young men classified into two groups according to their degree of obesity. One measure, specific gravity, was selected as a concomitant variable. The observations on these variates and the concomitant variable are given below (blank spaces represent missing observations):


If there were no missing observations, the model could be written in the form of Equation (24) or (25). However, since observations are missing from columns of Y and A (or Z), writing the model in the form of Equation (25) results in blanks in the Y and A matrices as shown below:

$$E(Y) = A\gamma \quad \text{and} \quad \operatorname{Var}(Y) = I_n \otimes \Sigma \tag{35}$$

where

            Y                    A = [X : Z]

      17.6     5.15            1    0    24
      13.4     5.75            1    0    32
      20.3     4.35            1    0    17
      22.3     7.55            1    0    30
      20.5     8.50            1    0    30
      18.5     Blank           1    0    Blank
      12.1     5.95            1    0    25
      12.0     6.30            1    0    30
      10.1     5.45            1    0    28
      Blank    3.75            1    0    24
      18.1     9.00            0    1    31
      19.7     5.30            0    1    Blank
      16.9     9.85            0    1    32
      23.7     3.60            0    1    20
      19.2     Blank           0    1    18
      18.0     4.40            0    1    23
      14.8     7.15            0    1    31
      15.6     7.25            0    1    28
      16.2     5.30            0    1    21

$$\gamma = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} = \begin{bmatrix} \alpha_{11} & \alpha_{21} \\ \alpha_{12} & \alpha_{22} \\ \beta_{11} & \beta_{21} \end{bmatrix} \quad \text{and} \quad \Sigma = \begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{bmatrix}.$$


Transforming to the vector representation [Equation (29)] eliminates the problem of missing observations among the dependent variables, but blanks remain among the independent variables as shown below:

$$E(z) = D\gamma \quad \text{and} \quad \operatorname{Var}(z) = \Omega \tag{36}$$




To transform to the vector representation [Equation (31)] of the MGMAC model, $A_1$ is replaced by $F_1$, $\gamma_1$ is replaced by $\xi_1$, $A_2$ is replaced by $F_2$ and $\gamma_2$ is replaced by $\xi_2$ as shown below:

$$E(z) = F\xi \quad \text{and} \quad \operatorname{Var}(z) = \Omega \tag{37}$$

where $\Omega$ is as defined before,


$$F = \begin{bmatrix} F_1 & \phi \\ \phi & F_2 \end{bmatrix}$$

with $F_1$ (18 x 5) =

      1   0   24   0   0
      1   0   32   0   0
      1   0   17   0   0
      1   0   30   0   0
      1   0   30   0   0
      1   0    0   1   0
      1   0   25   0   0
      1   0   30   0   0
      1   0   28   0   0
      0   1   31   0   0
      0   1    0   0   1
      0   1   32   0   0
      0   1   20   0   0
      0   1   18   0   0
      0   1   23   0   0
      0   1   31   0   0
      0   1   28   0   0
      0   1   21   0   0

and $F_2$ (17 x 4) =

      1   0   24   0
      1   0   32   0
      1   0   17   0
      1   0   30   0
      1   0   30   0
      1   0   25   0
      1   0   30   0
      1   0   28   0
      1   0   24   0
      0   1   31   0
      0   1    0   1
      0   1   32   0
      0   1   20   0
      0   1   23   0
      0   1   31   0
      0   1   28   0
      0   1   21   0


Estimation of $\Sigma$

Using Theorem 4, a consistent and unbiased estimator

$$\hat{\Sigma} = (\hat{\sigma}_{rs}) \tag{38}$$

of $\Sigma$ is obtained by letting $F_1$ and $F_2$ be defined as before and

$F_{12} = F_{21}$ =

      1   0   24   0
      1   0   32   0
      1   0   17   0
      1   0   30   0
      1   0   30   0
      1   0   25   0
      1   0   30   0
      1   0   28   0
      0   1   31   0
      0   1    0   1
      0   1   32   0
      0   1   20   0
      0   1   23   0
      0   1   31   0
      0   1   28   0
      0   1   21   0


Substitution of the above values into Equation (38) results in

$$\hat{\Sigma} = \begin{bmatrix} 13.9694 & 1.7376 \\ 1.7376 & 1.3775 \end{bmatrix}.$$
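
The computation of $\hat{\Sigma}$ can be sketched in Python as follows, applying the estimator of Theorem 4 to the observations printed in Equation (35). The helper names `design` and `sigma_hat` are hypothetical, and the construction of the design matrices (a zero in place of each blank covariate plus one indicator column per blank) is an assumption inferred from the $F_1$, $F_2$ and $F_{12}$ matrices of this example. The diagonal entries should come out close to the printed 13.9694 and 1.3775; the off-diagonal entry is worth comparing against the printed 1.7376 rather than assuming agreement.

```python
import numpy as np

# Rows of [Y | A] as printed in Equation (35): (y1, y2, x1, x2, z); None marks a blank.
rows = [
    (17.6, 5.15, 1, 0, 24), (13.4, 5.75, 1, 0, 32), (20.3, 4.35, 1, 0, 17),
    (22.3, 7.55, 1, 0, 30), (20.5, 8.50, 1, 0, 30), (18.5, None, 1, 0, None),
    (12.1, 5.95, 1, 0, 25), (12.0, 6.30, 1, 0, 30), (10.1, 5.45, 1, 0, 28),
    (None, 3.75, 1, 0, 24), (18.1, 9.00, 0, 1, 31), (19.7, 5.30, 0, 1, None),
    (16.9, 9.85, 0, 1, 32), (23.7, 3.60, 0, 1, 20), (19.2, None, 0, 1, 18),
    (18.0, 4.40, 0, 1, 23), (14.8, 7.15, 0, 1, 31), (15.6, 7.25, 0, 1, 28),
    (16.2, 5.30, 0, 1, 21),
]

def design(subset):
    """Group indicators, covariate (0 where blank), and one indicator column per blank covariate."""
    base = np.array([[x1, x2, 0.0 if z is None else z] for (_, _, x1, x2, z) in subset], float)
    blanks = [i for i, r in enumerate(subset) if r[4] is None]
    extra = np.zeros((len(subset), len(blanks)))
    for col, row in enumerate(blanks):
        extra[row, col] = 1.0
    return np.hstack([base, extra])

def sigma_hat(za, zb, F):
    """za' [I - F (F'F)^- F'] zb / (N - R(F)), as in Theorem 4."""
    n = F.shape[0]
    P = F @ np.linalg.pinv(F.T @ F) @ F.T
    return float(za @ (np.eye(n) - P) @ zb) / (n - np.linalg.matrix_rank(F))

obs1 = [r for r in rows if r[0] is not None]                       # units with y1 observed
obs2 = [r for r in rows if r[1] is not None]                       # units with y2 observed
both = [r for r in rows if r[0] is not None and r[1] is not None]  # units with both observed

F1, F2, F12 = design(obs1), design(obs2), design(both)
z1 = np.array([r[0] for r in obs1])
z2 = np.array([r[1] for r in obs2])
z12 = np.array([r[0] for r in both])
z21 = np.array([r[1] for r in both])

s11 = sigma_hat(z1, z1, F1)
s22 = sigma_hat(z2, z2, F2)
s12 = sigma_hat(z12, z21, F12)
Sigma_hat = np.array([[s11, s12], [s12, s22]])
print(np.round(Sigma_hat, 4))
```

The shapes of `F1`, `F2` and `F12` (18 x 5, 17 x 4 and 16 x 4) give a quick check that the rows with blanks have been handled in the same way as in the matrices printed above.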


ESTIMATION  OF  ORIGINAL  PARAMETERS 

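A BAN estimator which is unbiased for

$$H'\xi = (\alpha_{11}, \alpha_{12}, \beta_{11}, \alpha_{21}, \alpha_{22}, \beta_{21})'$$

is given by

$$H'\hat{\xi} = H'(F'\hat{\Omega}^{-1}F)^{-}F'\hat{\Omega}^{-1}z \tag{39}$$

where $\hat{\Omega}$ is obtained from $\Omega$ by substituting the elements of $\hat{\Sigma}$ given above for the corresponding elements of $\Sigma$,

$$H' = \begin{bmatrix} 1&0&0&0&0&0&0&0&0 \\ 0&1&0&0&0&0&0&0&0 \\ 0&0&1&0&0&0&0&0&0 \\ 0&0&0&0&0&1&0&0&0 \\ 0&0&0&0&0&0&1&0&0 \\ 0&0&0&0&0&0&0&1&0 \end{bmatrix}, \quad F = \begin{bmatrix} F_1 & \phi \\ \phi & F_2 \end{bmatrix} \quad \text{and} \quad z = \begin{bmatrix} z_1 \\ z_2 \end{bmatrix}.$$

Substitution of the appropriate values into Equation (39) yields

$$\begin{bmatrix} \hat{\alpha}_{11} \\ \hat{\alpha}_{12} \\ \hat{\beta}_{11} \\ \hat{\alpha}_{21} \\ \hat{\alpha}_{22} \\ \hat{\beta}_{21} \end{bmatrix} = \begin{bmatrix} 21.3524 \\ 23.0662 \\ -0.2071 \\ -4.0121 \\ -3.2103 \\ 0.3686 \end{bmatrix}.$$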

To obtain the parameters introduced into the model by the missing observations among the independent variables, in addition to the original parameters, $H = I_9$ would have been used in Equation (39).

HYPOTHESIS  TEST  - NO  OVERALL  GROUP  EFFECT 


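The joint null hypothesis of no overall group effect, which can be written

$$H_0: H'\xi = \begin{bmatrix} \alpha_{11} - \alpha_{12} \\ \alpha_{21} - \alpha_{22} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

where

$$H' = \begin{bmatrix} 1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & -1 & 0 & 0 \end{bmatrix} \quad \text{and} \quad \xi = (\alpha_{11}, \alpha_{12}, \beta_{11}, \delta_{11}, \delta_{12}, \alpha_{21}, \alpha_{22}, \beta_{21}, \delta_{21})',$$

is tested by computing

$$W_n = (H'\hat{\xi})'\left[H'(F'\hat{\Omega}^{-1}F)^{-}H\right]^{-1}(H'\hat{\xi}) \tag{40}$$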

Substitution of the appropriate values into Equation (40) results in $W_n = 3.037$, which yields an observed significance level between 0.1 and 0.25, based on the fact that $W_n$ is asymptotically distributed as a central chi-square with $R(H) = 2$ degrees of freedom. Based on the results of this test, it could be concluded that at the commonly accepted levels of significance there is not sufficient evidence to reject the joint null hypothesis of no difference in the characteristics of urine specimens of young men in Group I and Group II.
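
The observed significance level quoted above can be checked directly, because for 2 degrees of freedom the upper-tail probability of a central chi-square variable has the closed form $\exp(-x/2)$. The short Python sketch below uses only the standard library; the function name `chi2_sf_2df` is hypothetical.

```python
import math

def chi2_sf_2df(x):
    """Upper-tail probability P(X >= x) for a central chi-square with 2 degrees of freedom."""
    return math.exp(-x / 2.0)

W_n = 3.037                      # value of the test statistic reported in the text
p_value = chi2_sf_2df(W_n)
print(round(p_value, 3))         # about 0.219
```

The result, roughly 0.22, lies between 0.1 and 0.25 as stated. The closed form holds only for 2 degrees of freedom; other values of $w$ require chi-square tables or a statistical library.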


APPENDIX A

LIST OF SYMBOLS

$a_{(q \times 1)}$ is a $(q \times 1)$ column vector, and $a'$ is the corresponding $(1 \times q)$ row vector.

$A_{(p \times q)} = (a_{ij})$ is a $(p \times q)$ matrix with $a_{ij}$ as the element in the $i$th row and $j$th column.

$A = (A_{ij})$ is a partitioned matrix in which $A_{ij}$ is the sub-matrix in the $i$th row and $j$th column.

$R(A)$ is the rank of the matrix $A$.

$V(A)$ is the vector space spanned by the rows of $A$.

$A'$ is the transpose of $A$.

$\operatorname{tr} A$ is the trace of $A$.

$A^{-1}$ is the unique inverse of a square matrix $A$ of full rank.

$A^{-}$ is any generalized inverse of the matrix $A$ and is defined by $AA^{-}A = A$.

$A \otimes B$ is the Kronecker product of the matrices $A$ and $B$, defined by $A \otimes B = (a_{ij}B)$ where $A = (a_{ij})$.

$I_q$ is the identity matrix of order $q$.

$0_{(p \times 1)}$ is the $(p \times 1)$ vector of zeros.

$\phi_{(p \times q)}$ is the $(p \times q)$ matrix of zeros.

For $x = (x_1, \ldots, x_n)'$ and $y = (y_1, \ldots, y_m)'$:

$\operatorname{Cov}(x, y)$ is the $(n \times m)$ matrix with $\operatorname{Cov}(x_i, y_j)$ in the $i$th row and $j$th column;

$\operatorname{Var}(x)$ is the $(n \times n)$ matrix $\operatorname{Cov}(x, x)$.

For $Y_{(n \times p)} = (y_{ij})$, a matrix of random variables:

$E(Y)$ is the $(n \times p)$ matrix of expectations of the elements of $Y$, i.e., $E(Y) = (E\,y_{ij})$;

$\operatorname{Var}(Y)$ is the $(np \times np)$ variance-covariance matrix of the $(np \times 1)$ vector defined by putting the rows of $Y$ underneath each other in a long column vector.

$x \sim N_p(\mu, \Sigma)$ means that the random variable $x$ has a $p$-variate multinormal distribution with mean vector $\mu$ and variance-covariance matrix $\Sigma$.

$\Omega^{1/2}$ is the square root of a symmetric matrix $\Omega$, defined by $\Omega^{1/2} = C'\Lambda C$ where $C$ is an orthogonal matrix and $\Lambda$ is a diagonal matrix such that $\Omega = C'\Lambda^2 C$.

${}_i t$ is read as "the variable $t$ with left subscript $i$".

${}_i t_s$ is read as "the variable $t$ with left subscript $i$ and right subscript $s$".

APPENDIX B
GLOSSARY  OF  TERMS 

Given: A $p$-variate random sample, $Y_{(n \times p)}$, of size $n$ from a population with probability density function $f(Y, \theta)$, $\theta \in \Theta$ (parameter space), then:

An estimator, $T$, of the parameter $g(\theta)$, $\theta \in \Theta$, is a function of $Y$ whose range contains the range of $g(\theta)$.

An unbiased estimator, $T$, of $g(\theta)$ is one such that

$$E(T) = g(\theta), \quad \forall\, \theta \in \Theta .$$

Denote the class of unbiased estimators of $g(\theta)$ by $U_g$.

A Minimum Variance Unbiased Estimator (MVUE) of $g(\theta)$ is a $T \in U_g$ such that

$$\operatorname{Var}(T) \leq \operatorname{Var}(T^*), \quad \forall\, T^* \in U_g \text{ and } \theta \in \Theta .$$

Let $V_g$ be the class of all linear unbiased estimators of $g(\theta)$.

Then $T \in V_g$ if and only if

(i) $T \in U_g$ and

(ii) $T = a'Y$ where $a$ is some constant vector.

A Best Linear Unbiased Estimator (BLUE), $T$, of $g(\theta)$ is a $T \in V_g$ such that

$$\operatorname{Var}(T) \leq \operatorname{Var}(T^*), \quad \forall\, T^* \in V_g \text{ and } \theta \in \Theta .$$

A sequence of random variables $\{Z_n: n = 1, 2, \ldots\}$ converges in distribution to the random variable $Z$ with distribution function $F$ whenever

$$\lim_{n \to \infty} F_n(z) = F(z) \quad \text{for all continuity points } z \text{ of } F.$$

This is denoted by $Z_n \xrightarrow{d} F$, where $F_n$ is the distribution
function of $Z_n$ $(n = 1, 2, \ldots)$.

An estimator $\hat{\theta}_n$ based on a sample of $n$ observations is said to be a Best Asymptotic Normal (BAN) estimator for the parameter $\theta_0 = (\theta_1, \ldots, \theta_u)'$ provided

$$\sqrt{n}\,(\hat{\theta}_n - \theta_0) \xrightarrow{d} N_u(0, \Omega^{-1}), \quad \text{where}$$

$$\Omega_{(u \times u)} = \text{Fisher's Information Matrix} = E_{\theta_0}\left[-\frac{1}{n}\,\frac{\partial^2 \log \phi_n}{\partial \theta^2}\right]_{\theta = \theta_0}$$

where $\phi_n$ is the likelihood function for the sample.

$\hat{\theta}_n$ has asymptotic dispersion matrix $\frac{1}{n}\Omega^{-1}$.